1 | © Copyright 2024 Zilliz
Speaker
Jiang Chen
Ecosystem & Developer Experience
jiang.chen@zilliz.com
@jiangc1010
Multimodal RAG with Milvus and GPT4o
Jiang Chen @ Zilliz
CONTENTS
01 Multi-modal Embeddings
02 Multi-modal Search in Milvus
03 Demo of Multi-modal RAG with Milvus
Multi-modal Embeddings
Information Retrieval IR) at Multi-modal Setting
● CLIP 2021
Information Retrieval (IR) in a Cross-modality Setting
● Image and text modalities are processed independently
○ A classifier at heart
Visualized BGE
● Establishes an in-depth fusion of text and image data
● Preserves the original performance of the text embedding
○ The text encoder is kept fully fixed while the visual tokens are incorporated
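Why a fixed text encoder preserves text-only performance can be shown with a toy sketch. Everything here is a simplifying assumption: mean pooling stands in for the real transformer encoder, and `image_tokenizer` with its `projection` matrix is a hypothetical stand-in for the trainable component that maps image patches into the text-token space.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def frozen_text_encoder(token_vectors):
    """Stand-in for the fixed text encoder: mean-pool tokens, then normalize.

    Its behavior never changes, so text-only inputs always get the same embedding.
    """
    dim = len(token_vectors[0])
    count = len(token_vectors)
    pooled = [sum(tok[i] for tok in token_vectors) / count for i in range(dim)]
    return l2_normalize(pooled)

def image_tokenizer(patch_features, projection):
    """Hypothetical trainable part: project each image patch into token space."""
    return [
        [sum(w * x for w, x in zip(row, patch)) for row in projection]
        for patch in patch_features
    ]

def embed(text_tokens, patch_features=None, projection=None):
    """Embed text alone, or text with visual tokens prepended."""
    tokens = list(text_tokens)
    if patch_features:
        tokens = image_tokenizer(patch_features, projection) + tokens
    return frozen_text_encoder(tokens)

# Toy inputs: two 2-d text tokens, one 3-d image patch, a 2x3 projection.
text_tokens = [[1.0, 0.0], [0.0, 1.0]]
patches = [[1.0, 2.0, 3.0]]
projection = [[0.1, 0.0, 0.0], [0.0, 0.0, 0.5]]

text_only = embed(text_tokens)
composed = embed(text_tokens, patches, projection)
print(text_only, composed)
```

The text-only path never touches the image tokenizer, so its output is identical to what the original encoder would produce; only the composed image-plus-text embedding depends on the newly trained part.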
Visualized BGE - training technique
MagicLens
Multi-modal Search in Milvus
Retrieval-Augmented Generation
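The slide text gives only the title, so the following is a generic sketch of the retrieve-then-generate pattern rather than the speaker's pipeline. The corpus, its vectors, and the prompt wording are all invented for illustration; in a real system the vectors would come from an embedding model, the search from a vector database, and the prompt would be sent to a model such as GPT-4o.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, corpus, top_k=2):
    """Return the top_k corpus entries most similar to the query vector."""
    ranked = sorted(corpus, key=lambda doc: cosine(query_vec, doc["vector"]), reverse=True)
    return ranked[:top_k]

def build_prompt(question, hits):
    """Assemble the retrieved captions into a grounded prompt for the generator."""
    context = "\n".join(f"- {hit['caption']}" for hit in hits)
    return f"Use the retrieved items to answer.\nContext:\n{context}\nQuestion: {question}"

# Hypothetical corpus with toy 3-d embeddings.
corpus = [
    {"id": 1, "caption": "red leather sofa", "vector": [0.9, 0.1, 0.0]},
    {"id": 2, "caption": "blue office chair", "vector": [0.1, 0.9, 0.1]},
    {"id": 3, "caption": "red armchair", "vector": [0.8, 0.3, 0.1]},
]
query_vec = [1.0, 0.2, 0.0]

hits = retrieve(query_vec, corpus)
prompt = build_prompt("Which item matches a red couch?", hits)
print(prompt)
```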
Data Model Design in Milvus
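The schema itself is not included in the slide text. One plausible layout, assumed here, stores an image vector and a text vector per item, searches each field separately, and fuses the two rankings; Milvus offers reciprocal rank fusion (RRF) as one reranking strategy for this kind of multi-vector hybrid search. The sketch below is plain Python rather than the pymilvus API, and the field names, vectors, and `k=60` constant are illustrative choices.

```python
import math
from dataclasses import dataclass

@dataclass
class Item:
    """One record with two vector fields (hypothetical field names)."""
    id: int
    caption: str
    image_vector: list
    text_vector: list

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by(field, query_vec, items):
    """Rank item ids by similarity of one vector field to the query."""
    ranked = sorted(items, key=lambda it: cosine(query_vec, getattr(it, field)), reverse=True)
    return [it.id for it in ranked]

def rrf(rankings, k=60):
    """Reciprocal rank fusion: sum 1 / (k + rank) over all rankings."""
    scores = {}
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] = scores.get(item_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

items = [
    Item(1, "sofa photo", [1.0, 0.0], [0.0, 1.0]),
    Item(2, "sofa with caption", [0.8, 0.6], [0.9, 0.1]),
    Item(3, "chair photo", [0.0, 1.0], [0.6, 0.8]),
]

image_ranking = rank_by("image_vector", [1.0, 0.0], items)
text_ranking = rank_by("text_vector", [1.0, 0.0], items)
fused = rrf([image_ranking, text_ranking])
print(fused)
```

Item 2 is not the best match in the image ranking, yet it comes out first after fusion because it places well in both rankings, which is the behavior rank fusion is meant to provide.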
Demo of Multi-modal RAG with Milvus
Useful Links
● CLIP https://arxiv.org/abs/2103.00020
● Visualized BGE https://arxiv.org/abs/2406.04292
● MagicLens: https://arxiv.org/abs/2403.19651
● Multimodal RAG with Milvus 🖼:
https://milvus.io/docs/multimodal_rag_with_milvus.md
● Image Search with Milvus
https://milvus.io/docs/image_similarity_search.md
● Multimodal Image Search online demo:
https://multimodal-demo.milvus.io/
● Hybrid Image and Text search with Multi-vector in Milvus:
https://github.com/yiwen92/Milvus_hybridsearch/blob/main/multi-
modal-demo/demo.ipynb
THANK YOU
@jiangc1010

Multimodal RAG with Milvus and GPT-4o Webinar